McShane and Wyner (2011) (hereinafter MW2011) demonstrated that in
many cases a comprehensive data set of $p = 1138$ proxies [Mann et al. (2008)]
did not predict Northern Hemisphere (NH) mean temperatures significantly
better than random numbers. This fact is not very surprising in itself: the
unsupervised selection of good predictors from a set of $p \gg n$ proxies of
varying sensitivities might be too challenging a task for any statistical method
($p/n_c \approx 10$; only $n_c = 119$ out of total $n = 149$ years were used for
calibration in MW2011 cross-validated reconstructions). However, some types of
noise$^{1}$ systematically outperformed the real proxies (see two bottom panels
of MW2011, Figure 10). This finding begs further investigation: what do
these random numbers have that real proxies do not?

To investigate this question, the present analysis uses ridge regression 
[RR, Hoerl and Kennard (1970)] instead of the Lasso [Tibshirani (1996)].$^{2}$
The regression model used by MW2011 with Lasso and here with RR is

$$y = X\beta + \beta_0 1_n + \varepsilon,$$

where $y$ is a column vector of $n$ observations (annual NH temperatures),
$\varepsilon$ is random error, and $X$ is a known $n \times p$ matrix of
predictors (climate proxies). A vector of regression coefficients $\beta$ and an
intercept constant $\beta_0$ are to be determined. A column $n$-vector $1_n$ has
all components equal to one. Proxy records are standardized before use; in
cross-validation experiments standardization is repeated for each calibration
period.

$^{1}$ Pseudoproxies used by MW2011 are called "noise" here; in climate research,
pseudoproxies are synthetic combinations of a climate signal with some noise;
without the former, it is pure noise.

$^{2}$ The difference is in the penalty norm: Lasso uses $L_1$ while RR uses
$L_2$. MW2011 have also argued that a rough performance similarity should exist
between different methods for $p \gg n$ problems.

Let $w$ be a column $n_c$-vector such that $w^T 1_{n_c} = 1$. Define
matrix-valued functions $W[w] = I - 1_{n_c} w^T$ and
$\mathcal{R}[S, \lambda, w] = S_{vc}(S_{cc} + \lambda I)^{-1} W[w] + 1_{n_v} w^T$,
where $S$ is a positive semidefinite $n \times n$ matrix, $\lambda > 0$ is the
ridge parameter found as a minimizer of the generalized cross-validation
function [GCV, Golub et al. (1979)], and matrix (or vector) subscripts $c$ or
$v$ hereinafter indicate submatrices corresponding to the calibration or
validation periods, respectively. The RR reconstruction $\hat{y}_v$ of
temperatures in the validation period (a "holdout block" of $n_v = 30$
consecutive years) is a linear transformation:
$\hat{y}_v = \mathcal{R}[S_p, \lambda, e] y_c$, where
$S_p = \tilde{X}\tilde{X}^T/p$, $\tilde{X}$ is the standardized version of $X$,
and $e = n_c^{-1} 1_{n_c}$.
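The reconstruction operator described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's Matlab code. The function names (`standardize`, `rr_operator`, `gcv_score`), the synthetic data, the position of the holdout block, and the log-spaced grid for the ridge parameter are all hypothetical. The GCV score is written in its textbook form for a linear smoother, and the way the intercept enters it here is an assumption. The final lines check that this "dual" form, which needs only the $n \times n$ matrix $S_p$, agrees with the familiar ridge solution computed from the $p$ coefficients.

```python
import numpy as np

def standardize(X, cal):
    """Center and scale each proxy (column) using calibration-period statistics."""
    mu = X[cal].mean(axis=0)
    sd = X[cal].std(axis=0, ddof=1)
    return (X - mu) / sd

def rr_operator(S, cal, val, lam, w):
    """R[S, lam, w] = S_vc (S_cc + lam*I)^{-1} W[w] + 1_{n_v} w^T."""
    n_c, n_v = len(cal), len(val)
    W = np.eye(n_c) - np.outer(np.ones(n_c), w)        # W[w] = I - 1 w^T
    S_cc = S[np.ix_(cal, cal)]
    S_vc = S[np.ix_(val, cal)]
    gain = S_vc @ np.linalg.solve(S_cc + lam * np.eye(n_c), W)
    return gain + np.outer(np.ones(n_v), w)

def gcv_score(S_cc, y_c, lam, w):
    """n_c * ||(I - A) y_c||^2 / tr(I - A)^2, with I - A = lam (S_cc + lam*I)^{-1} W[w]."""
    n_c = len(y_c)
    W = np.eye(n_c) - np.outer(np.ones(n_c), w)
    resid_op = lam * np.linalg.solve(S_cc + lam * np.eye(n_c), W)
    resid = resid_op @ y_c
    return n_c * (resid @ resid) / np.trace(resid_op) ** 2

# Synthetic stand-in data (NOT the Mann et al. proxies or the NH record).
rng = np.random.default_rng(0)
n, p = 149, 50
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

val = np.arange(60, 90)                      # one 30-year holdout block
cal = np.setdiff1d(np.arange(n), val)        # the remaining 119 calibration years
Xs = standardize(X, cal)
S_p = Xs @ Xs.T / p
e = np.full(len(cal), 1.0 / len(cal))        # e^T y_c is the calibration mean

# Choose lambda on a hypothetical log-spaced grid by minimizing the GCV score.
grid = np.logspace(-3, 3, 61)
scores = [gcv_score(S_p[np.ix_(cal, cal)], y[cal], lam, e) for lam in grid]
lam_min = grid[int(np.argmin(scores))]

y_hat_dual = rr_operator(S_p, cal, val, lam_min, e) @ y[cal]

# Cross-check against the primal ridge solution with an intercept.
Xc, yc_mean = Xs[cal], y[cal].mean()
beta = np.linalg.solve(Xc.T @ Xc + p * lam_min * np.eye(p), Xc.T @ (y[cal] - yc_mean))
y_hat_primal = Xs[val] @ beta + yc_mean
```

Because $W[w] 1_{n_c} = 0$ whenever $w^T 1_{n_c} = 1$, every row of the operator sums to one: a constant temperature series is reconstructed exactly, whatever the proxies contain.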

Using these formulas, the RR version of the MW2011 cross-validation tests
was performed for real proxies and for some noise types. Results are shown
in Figure 1. The cross-validated root mean square error (RMSE) of the RR
reconstructions is smaller than the Lasso values (cf. MW2011, Figure 9), but
the relative performance in different experiments appears consistent between
RR and Lasso. As in the Lasso case, noise with high temporal persistence,
that is, simulated by the Brownian motion or by the first-order autoregressive
process AR(1) with a parameter $\varphi > 0.9$, outperformed proxies. Figure 2
illustrates the time dependence of the holdout error for the real-proxy,
white-noise, and $\varphi = 0.99$ AR(1) cases. There is a general similarity
between these and the corresponding curves in Figure 10 by MW2011.

Fig. 1. Cross-validated RMSE on 120 30-year holdout blocks for the RR reconstructions
from real climate proxies and from the random noise (one realization for each noise
experiment); cf. MW2011, Figure 9.

Fig. 2. Holdout RMSE for RR reconstructions as a function of time for real proxies
(red) and two 100-member ensemble means: white noise (blue) and AR(1) noise with
$\varphi = 0.99$ (black). The probability limit ($p \to \infty$) for the latter is
shown by magenta dashes. Holdout RMSE for simple kriging of the NH mean temperature
index using an exponential semivariogram [Le and Zidek (2006)]
$\gamma(\tau) = \lambda_{\min} + 1 - \exp[\tau \ln \varphi]$ with the GCV-selected
nugget $\lambda_{\min} = \ell(\Phi, 0)$ and long decorrelation scale
$-1/\ln(\varphi) = 99.5$ years ($\tau$ is time in years) is shown by the green line.
Individual ensemble members are shown by magenta and yellow dots, respectively.

Note that a traditional approach to hypothesis testing would evaluate 
an RMSE corresponding to a regression of temperature data (y) on real 
proxies (X) in the context of the RMSE probability distribution induced 
by the assumed distribution of y under the hypothesized condition (e.g., 
$\beta = 0$). However, MW2011 evaluate the RMSE of real proxies in the context
of the RMSE distribution induced by random values in $X$, not $y$. Such an
approach to testing a null hypothesis would be appropriate for an inverse
relationship, that is, $X = y\beta^T + 1_n\beta_0^T$. When used with a direct
regression model here, however, it results in the RMSE distribution with a
surprising feature: when $p \to \infty$, RMSE values for individual realizations
of the noise matrix $X$ converge in probability to a constant.

This convergence occurs because the columns $x$ of $X$ in the noise
experiments are i.i.d. from the noise distribution; AR(1) with $\varphi = 0.99$
is considered here: $x \sim \mathcal{N}(0, \Phi)$, $\Phi = (\varphi^{|i-j|})$.
The columns $\tilde{x}$ of $\tilde{X}$ are i.i.d. too, hence the random matrix
$S_p = \tilde{X}\tilde{X}^T/p$ is an average of $p$ i.i.d. variates
$\tilde{x}\tilde{x}^T$. Expectation $\Psi = E\tilde{x}\tilde{x}^T$ exists; its
elements are computed as expectations of ratios and first inverse moments of
quadratic forms in normal variables [Jones (1986, 1987)]. The weak law of large
numbers applies, so $S_p \xrightarrow{p} \Psi$. Since the GCV function depends
on $S$ and $w$ as well as on $\lambda$, its minimizing $\lambda$ will depend on
these parameters too: $\lambda_{\min} = \ell[S, w]$. Here GCV is assumed
well-behaved, so that $\ell$ is a single-valued function, continuous at
$(\Psi, e)$. From the definition of $\mathcal{R}$,
$B[S, e] = \mathcal{R}[S, \ell[S, e], e]$ will also be continuous at
$S = \Psi$; thus $S_p \xrightarrow{p} \Psi$ implies
$\hat{y}_v = B[S_p, e] y_c \xrightarrow{p} B[\Psi, e] y_c$.
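The law-of-large-numbers step can be checked numerically. The sketch below is a simplified illustration rather than a reproduction of the experiments. It skips proxy standardization, so the limit of $X X^T/p$ is the AR(1) covariance $\Phi$ itself rather than the matrix $\Psi$ of the standardized case. The helper names (`ar1_cov`, `max_cov_error`), the random seed, and the three values of $p$ are hypothetical choices.

```python
import numpy as np

def ar1_cov(n, phi):
    """Unit-variance AR(1) covariance matrix: Phi[i, j] = phi**|i - j|."""
    t = np.arange(n)
    return phi ** np.abs(t[:, None] - t[None, :])

def max_cov_error(n, p, phi, rng):
    """Largest elementwise gap between S_p = X X^T / p and Phi for one noise draw."""
    Phi = ar1_cov(n, phi)
    L = np.linalg.cholesky(Phi)
    X = L @ rng.standard_normal((n, p))      # p i.i.d. columns, each ~ N(0, Phi)
    return np.max(np.abs(X @ X.T / p - Phi))

rng = np.random.default_rng(0)
errors = {p: max_cov_error(149, p, 0.99, rng) for p in (100, 1138, 20000)}
for p, err in errors.items():
    print(f"p = {p:6d}: max |S_p - Phi| = {err:.3f}")
```

Each element of $S_p$ has a standard deviation of at most $\sqrt{2/p}$ in this setting, so the gap is expected to shrink roughly as $p^{-1/2}$.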

When $p$ is finite but large, like $p = 1138$, reconstructions based on
individual realizations of a noise matrix $X$ are dominated by their constant
components, especially when $\varphi \approx 1$: note the small scatter of RMSE
values in the ensemble of AR(1) with $\varphi = 0.99$ (yellow dots in Figure 2).
The probability limit $\hat{y}_v = B[\Psi, e] y_c$ yields RMSE values (magenta
dashes in Figure 2) that are very close ($1.3 \cdot 10^{-?}$ °C RMS difference)
to the ensemble mean RMSE (black curve in Figure 2). To interpret this
non-random reconstruction, consider its simpler analogue, using neither proxy
standardization nor a regression intercept ($\beta_0$). Then, if the assumptions
on the GCV function change accordingly,
$\hat{y}_v \xrightarrow{p} B[\Phi, 0] y_c = \Phi_{vc}[\Phi_{cc} + \ell(\Phi, 0) I]^{-1} y_c$,
that is, a prediction of $y_v$ from $y_c$ by "simple kriging" [Stein (1999,
page 8)], which in atmospheric sciences is called objective analysis or optimal
interpolation [Gandin (1963)]. The RMSE corresponding to this solution is shown
in Figure 2 (green line): it is quite close to the ensemble mean RMSE for AR(1)
noise with $\varphi = 0.99$ (RMS difference is $5.4 \cdot 10^{-?}$ °C). The
solution $B[\Psi, e] y_c$, to which the noise reconstructions without
simplifications converge as $p \to \infty$, is more difficult to interpret.
Still, it has a structure of an objective analysis solution and gives results
that are similar to simple kriging: the RMS difference between the two
reconstructions over all holdout blocks is $7.7 \cdot 10^{-?}$ °C.
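The simple-kriging analogue can be written out directly. The sketch below is a minimal illustration, not the computation behind Figure 2. It uses a synthetic random-walk series in place of the NH temperature index, and the nugget value 0.1 is chosen arbitrarily, because the GCV-selected value is not quoted in the text. The function name `simple_kriging` and the position of the holdout block are likewise hypothetical.

```python
import numpy as np

def simple_kriging(y_c, t_c, t_v, phi, nugget):
    """Predict y at times t_v from y_c at times t_c:
    y_hat_v = Phi_vc (Phi_cc + nugget*I)^{-1} y_c, with Phi[i, j] = phi**|t_i - t_j|."""
    t_c, t_v = np.asarray(t_c), np.asarray(t_v)
    Phi_cc = phi ** np.abs(t_c[:, None] - t_c[None, :])
    Phi_vc = phi ** np.abs(t_v[:, None] - t_c[None, :])
    return Phi_vc @ np.linalg.solve(Phi_cc + nugget * np.eye(len(t_c)), y_c)

# Synthetic stand-in for the temperature index (NOT the real NH record).
rng = np.random.default_rng(1)
years = np.arange(149)
y = 0.05 * np.cumsum(rng.standard_normal(149))   # a persistent random-walk series
val = np.arange(60, 90)                          # one 30-year holdout block
cal = np.setdiff1d(years, val)

# phi = 0.99 follows the text; the nugget is an illustrative value only.
y_hat = simple_kriging(y[cal], cal, val, phi=0.99, nugget=0.1)
rmse = np.sqrt(np.mean((y_hat - y[val]) ** 2))
```

No proxy matrix appears anywhere in this procedure: the holdout block is filled in purely from the calibration-period temperatures and the assumed temporal covariance, which is the sense in which the noise experiments amount to interpolation in time.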

Due to the large value of $p$ in the MW2011 experiments, their tests with
the noise in place of proxies essentially reconstruct holdout temperatures by
a kriging-like procedure in the temporal dimension. The covariance for this
reconstruction procedure is set by the temporal autocovariance of the noise.
Long decorrelation scales ($\varphi > 0.95$) gave very good results, implying that
long-range correlation structures carry useful information about predictand
time series that is not supplied by proxies. By using such a noise for their null
hypothesis, MW2011 make one skillful model (multivariate linear regression
on proxies) compete against another (statistical interpolation in time) and
conclude that a loser is useless. Such an inference does not seem justified.

Modern analysis systems do not throw away observations simply because 
they are less skillful than other information sources: instead, they combine 
information. MW2011 experiments have shown that their multivariate re- 
gressions on the proxy data would benefit from additional constraints on 
the temporal variability of the target time series, for example, with an AR 
model. After proxies are combined with such a model, a test of the significance
of their contributions to the common product could be performed.
SUPPLEMENTARY MATERIAL

Data and codes (DOI: 10.1214/10-AOAS398MSUPP; .zip). This supplement
contains a tar archive with all data files and codes (Matlab scripts)
needed for reproducing results presented in this discussion. Dependencies
between files in the archive and the order in which Matlab scripts have to
be executed are described in the file README_final, also included in the
archive.